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Note 

The directory http://deleeuwpdx.net/pubfolders/homals has a pdf copy of this article, the complete Rmd file 
with all code chunks, and R and C files with the code. 

Loss Function 

In the multivariate analysis techniques presented in this paper the data are measurements or categorizations 
of n objects by p variables. Variables are partitioned into m sets of variables , with set j containing pj variables. 
Thus J2jLiPj = P- The number of variables in a set is also called the rank of the set. 

Homogeneity Analysis, as defined in this paper, is the minimization of the loss function 

, m 

a(X,H,A) = -Y J SSQ( x ~ H j A j), (1) 

m j=i 


where SSQ() is the sum of squares. 

The loss function depends on three sets of parameters X, H and A. One of them, the matrix X, is common 
to all sets. The data are in the matrix Hj, which codes observations on the pj variables in set j. Homogeneity 
analysis does not operate on the variables directly, but on admissible transformations of the variables. Matrix 
Aj has coefficients used to make linear combinations of the variables in set j. 

1. The n x r matrix X of object scores is required to be column-centered and normalized by X'X = I. 
The dimensionality r is chosen by the user. 

2. The n x pj matrices of transformed variables Hj have columns hj(, with £ = 1, • • • ,Pj. We require 
hjt £ JC\g fl S, where ICje is a cone of admissible transformations and S is the unit sphere in R n . For 
example, K.j( can be the cone of all centered vectors in R n that are monotone with the data. 

3. The pj x r matrices Aj of component loadings are unconstrained. 

Thus we make linear combinations of the transformed variables in such a way that the set scores HjAj 
are as close as possible to the object scores X, and consequently as close as possible to each other. This 
explains the name of the technique: we want to make the m set scores as homogeneous as possible. Note 
that homogeneity analysis is both linear and non-linear, in the sense that it makes linear combinations of 
non-linear transformations of the variables. 

The history of loss function (1) is complicated. Least squares and eigenvalue methods for quantifying 
multivariate qualitative data were introduced by Guttman (1941). They were taken up by, among others, 
De Leeuw (1968) and by Benzecri and his students (Cordier 1965). In this earlier work the emphasis was 
often on optimizing quadratic forms, or ratios of quadratic forms, and not so much on least squares, distance 
geometry, and so-called biplots (Gower and Hand 1996). 

In De Leeuw (1974) a first attempt was made to unify most classical descriptive multivariate techniques 
using a single least squares loss function and a corresponding alternating least squares (ALS) optimization 
method. Guttman’s quantification method, which later became known as multiple correspondence analysis, 
was merged with linear and nonlinear principal component analysis in the HOMALS/PRINCALS techniques 
and programs (De Leeuw and Van Rijckevorsel 1980). The homogeneity analysis loss function that was 
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chosen ultimately, for example in the work of Van der Burg, De Leeuw, and Verdegaal (1988), had been used 
earlier by Carroll (1968) in multi-set canonical correlation analysis. 

In the Gifi system (Gih 1990, Michailidis and De Leeuw (1998)) a slightly different parametrization, and a 
correspondingly different ALS algorithm, were used. The loss function used by Gifi is 

i m Pj 

a(X, Y) = - J2 SSQ (X - ]T (2) 

TO i=i t =l 

where the Gjn are known spanning matrices for the cones of admissible transformations, and the Yjf are 
kji x p matrices of category quantifications. By using the full rank decompositions Y.p = ZjjAji we can show 
that (1) and (2) are essentially the same. We feel that the new parametrization in terms of Hj and and A 3 
has some conceptual and computational advantages. 

Algorithm 

The standard way to minimize loss function (2) is implemented in the OVERALS program (Van der Burg, De 
Leeuw, and Verdegaal 1988, Meulman and Heiser (2011)). It is also the one used in the homals package (De 
Leeuw and Mair 2009). 

In this paper the algorithm is different because we use the loss function (1). We still use ALS, which means 
in this case that we cycle through three substeps in each iteration. We update A for given X and H , we then 
update X for given H and A, and finally we update H for given X and A. Algorithm A goes as follows. 

0. Set k = 0 and start with some X^°\ H^°\ A)°b 

1 . X ( fe+1 ) = ortho(center(E”li Hj k%> A^)). 

2. For j = 1, • • • , m compute A^ k+1 ^ = {Hj kr> } + X( k+1 \ 

3. For j = 1, • - • , m and s = 1 ,-"Pj compute hf s +1) = proj KjsnS ((X( fc+1 ) - Et< s hf t +1) {af t +1) }' - 

Zt>eh${a$ +1) y)ai k+1) ). 

4. If converged stop. Else k <— k + 1 and go to step 1. 

In step 1 we use superscript + for the Moore-Penrose inverse. In step 2 the center operator does column 
centering, the ortho operator finds an orthonormal basis for the column space of its argument. 

The complicated part is step 4, the optimal scaling , i.e. the updating of Hj for given X and Aj. We cycle 
through the variables in the set, each time projecting a single column on the cone of admissible transformations 
of the variable, and then normalizing the projection to length one. The target, i.e. the vector we are projecting, 
is complicated, because the other variables in the same set must be taken into account. 

In order to simplify the optimal scaling computations within an iteration we can use majorization (De Leeuw 
1994, Heiser (1995), Lange, Hunter, and Yang (2000), J. De Leeuw (2015a)). This has the additional benefit 
that the optimal scaling step becomes embarassingly parallel. We expand the loss for set j around a previous 
solution Hj. 

SSQ(X - HjAj) = SSQf A - HjAj) - 2tr (Hj - Hfi)\X - H j A j )A'j + tr A' (Hj - Hj)'(Hj - Hj)A r 
Now 

tr (Hj - Hj)AjA'j(Hj - Hj)' < Kj tr (Hj - H 3 )'(H 3 - Hj), 
where Kj is the largest eigenvalue of A) A j. Thus 

SSQ(X - HjAj) < SSQ(A - HjAj) + Kj SSQ (Hj - Uj) - — SSQ((A - H J A j )A l j ), 

Kj 

where Uj is the target 

Uj=Hj + ^(X-HjA j )A'j. (3) 

K j 
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It follows we can update the optimal scaling of the variables by projecting the columns of Uj on their 
respective cones and then normalizing. See De Leeuw (1975) for results on normalized cone regression. This 
can be done for all variables in the set separately, without taking any of the other variables in the set (or in 
any of the other sets) into account. Thus the optimal scaling is easy to parallellize. The resulting algorithm 
B is as follows. 

0. Set k = 0 and start with some , H^°\ A^°k 

1. X( k+1 ) = ortho(center(JO”( =1 A^)). 

2. For j = 1, • • • , m compute A^ k+1 ^ = {Hj k/> } + X^ k+1 \ 

3. For j = 1, • • • ,m compute Uj k+1 ^ = + A-(W( fe+1 ) — Hj k ' 1 A < ' k+1 ^){A < ' k+1) }' and for s = 1, • • -pj 

compute hf s +1) = proj^ . sn5 (u^ +1) ). 

4. If converged stop. Else k <— k + 1 and go to step 1. 

Implementation Details 

If we follow the ALS strategy strictly the ortho() operator should be implemented using Procrustus rotation 
(Gibson 1962). Thus if Z = KAL 1 is the singular value decomposition of X, then ortho(Z) = KL'. Note, 
however, that any other basis for the column space of Z merely differs from the Procrustus basis by a rotation. 
And this rotation matrix will carry unmodified into the upgrade of Aj in step 2 of the algorithm, and thus 
after steps 1 and 2 the loss will be the same, no matter wlrclr rotation we select. In our algorithm we use the 
QR decomposition to find the basis, using the Gram-Schmidt code from J De Leeuw (2015). 

We implement the cone restrictions by the constraints hj S = Gj s z s in combination with Tj S hj S > 0. Thus the 
transformed variables must be in the intersection of the subspace spanned by the columns of the transformation 
basis Gj s and the polyhedral convex cones of all vectors h such that Tj S h > 0. We suppose that all columns 
of the Gj s add up to zero, and we require, in addition, the normalization SSQ(hj s ) = 1. 

In earlier homogeneity analysis work, summarized for example in Gifi (1990) or Michailidis and De Leeuw 
(1998), the transformation basis matrices Gj S were binary zero-one matrices, indicating category membership. 
The same is true for the software in IBM SPSS Categories (Meulman and Heiser 2011) or in the R package 
homals (De Leeuw and Mair 2009). In this paper we extend the current homogeneity analysis software using 
B-spline bases, which provide a form of fuzzy non-binary coding suitable for both categorical and numerical 
variables (Van Rijckevorsel and De Leeuw 1988). These generalizations were already discussed in De Leeuw, 
Van Rijckevorsel, and Van der Wouden (1981) and Gifi (1990), but corresponding easily accessible software 
was never released. 

We use the code described in J. De Leeuw (2015b) to generate B-spline bases. Note that for coding purposes 
binary indicators are B-splines of degree zero, while polynomials are B-splines without interior knots. Also 
note that binary indicators can be created for qualitative non-numerical variables, for which B-splines are not 
defined. We have added the option using degree -1 to bypass the B-spline code and generate an indicator 
matrix. Throughout we first ortlronormalize the basis matrices Gj S , using the Gram-Schmidt code from J De 
Leeuw (2015). 

The matrices Tj S in the homogeneous linear inequality restrictions that define the cones ICj S can be used to 
define monotonicity or convexity of the resulting transformations. In the current implementation we merely 
allow for monotonicity, which means the Tj S do not have to be stored. The transformations for each variable 
can be restricted to be increasing, or they can be unrestricted. By using splines without interior knots we 
allow in addition for polynomial transformations, which again can be restricted to be either monotonic or 
not. This covers the previous Gifi types nominal , ordinal , and numerical , which were of course designed for 
categorical variables with a small number of categories. Note that it is somewhat misleading to say we are 
fitting monotone splines or polynomials, we are mainly requiring monotonicity at the data points. 

Missing data are incorporated in the definition of the cones of transformations by using a Gj S which is the 
direct sum of a spline basis for the non-missing and an identity matrix for the missing data. This is called 
missing data multiple in Gifi (1990). There are no linear inequality restrictions on the quantifications of the 
missing data. 
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Associated Eigenvalue Problems 

Associated with the problem of minimizing loss function (1) are some eigen and singular value problems 
defined by the matrices Hj. This has been discussed extensively in Gifi (1990), and there are some more 
recent discussions in Tenenhaus and Tenenhaus (2011) and Van der Velden and Takane (2012). 

Suppose H+ is the Moore-Penrose inverse of Hj and Pj = HjH+ is the orthogonal projector associated with 
the column space of Hj. Then 

m 

min cr(X, H,A) = — y^(r — tr X'PjX) = r — tr X'P+X, 

3 3 — 1 

with P* the average of the projectors Pj. Thus 

r 

min min a(X, H 1 A) = r — A S (P*), 


where X s are the ordered eigenvalues of P* (from large to small). Thus we see that homogeneity analysis 
chooses the Hj, i.e. transforms or quantifies the variables, in such a way that the sum of the r largest 
eigenvalues of P* is maximized. 


Now consider alternative restrictions where we do not normalize X, but we normalize the loadings A by 
requiring that 


1 

m 


m 


3 =1 


= 1 , 


where Dj = H'jHj. Also define Cj V = H'jH v . We can collect the matrices Cj V in an p x p super-matrix C, 
which we we will call the Burt matrix. Note that if H = [Hi H 2 ■ ■ • //„,.], then C = H’H. The matrix 

D is defined as the direct sum of the Dj, i.e. it consists of the diagonal submatrices of C. 


The minimum of loss over unrestricted X for fixed A and H is attained at the average of the set scores 
x = b HjA.j, and thus 

min cr(X, H,A)=r -^-tr A'CA, 

x m z 

and 

r 

min A'DA = ml min cr(X, H,A) = r — A S (C', mD), 

S=1 

where the A s are now the generalized eigenvalues of the pair C and mD. 


If we define K by 

K = mri [Hi(H[Hi)-l ■ ■ ■ H m {H' m H m )- 1] , 

with ( H'jHj)~i the Moore-Penrose inverse of the symmetric square root. Then P* = KK' and the non-zero 
eigenvalues of P* are the same as those of K'K, which in turn are equal to the generalized eigenvalues of the 
pair ( C,mD ). Thus homogeneity analysis can also be interpreted as transforming the variables in such a way 
that the sum of the p largest generalized eigenvalues of ( C, mD) is maximized. 


Special Cases 

1. If there are only two sets the generalized eigenvalue problem for the Burt matrix becomes 


'D 1 

Ci 2 


ai 

= 2A 

'Di 

0 ' 


ax 

C 2 i 

d 2 _ 


_a 2 _ 

0 

D 2 




which we can rewrite as 

Ci2d2 = (2A — l)Diai, 
C 21 CH = (2A — \) D 2 a 2 , 
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from which we see that homogeneity analysis maximizes the sum of the r largest canonical correlations 
between Hi and H 2 . See also Van der Velden (2012). 

2. Suppose all to sets each contain only a single variable. Then the Burt matrix is the correlation matrix 
of the H,j, which are all n x 1 matrices in this case. It follows that homogeneity analysis maximizes 
the sum of the r largest eigenvalues of the correlation matrix over transformations, i.e. homogeneity 
analysis is nonlinear principal component analysis (De Leeuw 2014). 

3. Suppose all basis matrices Gjt in set j are the same, say equal to Gj. Then the set scores HjAj are 
equal to GjZjAj , which we can write simply as GjYj. Thus loss must be minimized over A' and the 
Y,j. If all Gj are binary indicators of categorical variables, and the to sets are all of rank one, then 
homogeneity analysis is multiple correspondence analysis (MCA). The set scores GjYj are kj different 
points, with kj the number of categories of the variable, usually much less than n. The plot connecting 
the set scores to the object scores is called the star plot of the variable. 

4. More generally, we can include an arbitrary number of copies of a variable in a set by using the same 
basis matrix Gj a number of times. As soon as we have decided how many copies to include, the 
algorithm can forget all about the fact that some variables are copies and just treat them like any other 
variable. The notion of copies replaces the notion of the rank of a quantification used in traditional 
Gifi, which in turn generalizes the distinction between single and multiple quantifications. 

5. If the second set only contains a single copy of a single variable then we choose transformations that 
maximize the multiple correlation of that variable and the variables in the first set. 

6. If the second set contains more than one copy of a single variable and we use binary indicator coding for 
that variable, then we optimize the eigenvalue (between/within ratio) sums for a canonical discriminant 
analysis. 

MCA Example: Thirteen Personality Scales 

Our first example is a small data set from the psych package (Revelle 2015) of five scales from the Eysenck 
Personality Inventory, five from a Big Five inventory, a Beck Depression Inventory, and State and Trait 
Anxiety measures. 

data (epi.bfi, package = "psych") 
epi <- epi.bfi 

epi_knots <- lapply (epi, function (x) fivenum (x) [2 : 4] ) 

epi_degrees <- rep (0, 13) 

epi_ordinal <- rep (FALSE, 13) 

epi_copies <- rep (2,13) 

epi_sets <- 1:13 

We perform a two-dimensional MCA, using degree zero and inner knots at the three quartiles for all 13 
variables. 

h <- homals(epi, epi_knots, epi_degrees, epi_ordinal, epi_sets, epi_copies, verbose = FALSE) 
We have convergence in 260 iterations to loss 0.7478043. The object scores are in figure 8. 
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dim 1 


Figure 8: Personality Scales, Object Scores, Multiple Nominal, Degree Zero 

Figure 9 has the GjYj for each of the thirteen variables, with the first dimension in red, and the second 
dimension in blue. Because the degree of the splines is zero, these transformation plots show step functions, 
with the steps at the knots, which are represented by vertical lines. 
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Figure 9: Personality Scales, Transformations, Multiple Nominal, Degree Zero 
The thirteen star plots are in figure 9. 
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Figure 10: Personality Scales, Star Plots, Multiple Nominal, Degree Zero 

Now change the degree to two for all variables, i.e. fit piecewise quadratic polynomials which are differentiable 
at the knots. We still have two copies for each variable, and these two copies define the sets. 

epi_degrees <- rep (2, 13) 

h <- homals(epi, epi_knots, epi_degrees, epi_ordinal, epi_sets, epi_copies, verbose = FALSE) 

We have convergence in 785 iterations to loss 0.7179135. The object scores are in figure 11 and the 
transformation plots in figure 12. 
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Figure 11: Personality Scales, Object Scores, Multiple Nominal, Degree Two 
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Personality Scales, Transformations, Multiple Nominal, Degree Two 
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NLPCA Example: Thirteen Personality Scales 

We use the same data as before for an NLPCA with all sets of rank one, all variables ordinal, and splines of 
degree 2. 
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library (nnls) 

epi_copies <- rep (1, 13) 

epi_ordinal <- rep (TRUE, 13) 

h <- homals(epi, epi_knots, epi_degrees, epi_ordinal, epi_sets, epi_copies, verbose = FALSE) 

In 19 iterations we find minimum loss 0.7330982. The object scores are in figure 13 and the transformation 
plots in figure 14. NLPCA maximizes the sum of the two largest eigenvalues of the correlation matrix of the 
variables. Before transformation the eigenvalues are 4.0043587, 2.6702003, 1.9970912, 0.8813983, 0.6571463, 
0.6299946, 0.5246896, 0.4657022, 0.3457515, 0.3403361, 0.2767531, 0.1835449, 0.0230333, after transformation 
they are 4.1939722, 2.7454868, 1.604906, 0.8209072, 0.7184825, 0.677183, 0.51865, 0.4545214, 0.4200148, 
0.351787, 0.2928574, 0.1699557, 0.0312759. The sum of the first two goes from 6.674559 to 6.9394591. 



diml 

Figure 13: Personality Scales, Object Scores, Single Ordinal, Degree Two 
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Figure 14: Personality Scales, Transformations, Single Ordinal, Degree Two 

We repeat the analysis with ordinal variables of degree two, without interior knots. Thus we the transformation 
plots will be quadratic polynomials that are monotone over the range of the data. 

epi_knots <- lapply (1:13, function (j) numeric (0)) 

h <- homals(epi, epi_knots, epi_degrees, epi_ordinal, epi_sets, epi_copies, verbose = FALSE) 


In 20 iterations we find minimum loss 0.7393666. The object scores are in figure 15 and the transformation 
plots in figure 16. The eigenvalues are now 4.0828642, 2.6936186, 1.8391342, 0.8732231, 0.6666505, 0.6491709, 
0.5390077, 0.459182, 0.3632868, 0.3471175, 0.2845394, 0.1782232, 0.023982, with sum of the first two equal to 
6.7764828. 
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Figure 15: Personality Scales, Object Scores, Single Numerical, Degree Two 



16 




















bfopen 


bdi 


traitanx 


statanx 


Figure 16: Personality Scales, Transformations, Single Numerical, Degree Two 


Regression Example: Gases with Convertible Components 

We analyze a regression example, using data from Neumann, previously used by Willard Gibbs, and analyzed 
with regression in a still quite readable article by Wilson (1926). Wilson’s analysis was discussed and modified 
using splines in Gifi (1990, 370-76). In the regression analysis in this section we use two copies of temperature, 
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with spline degree zero, and the first copy ordinal. For pressure and the dependent variable density we use a 
single ordinal copy with spline degree two. 

data (neumann, package = "homals") 

neumann_knots <- lapply (neumann, function (x) fivenum (x)[2:4]) 

neumann_degrees <- c(0,2,2) 

neumann_ordinal <-c(TRUE, TRUE, TRUE) 

neumann_copies <- c(2,l,l) 

neumann_sets <- c(l,l,2) 

h <- homals (neumann, neumann_knots, neumann_degrees, neumann_ordinal, neumann_sets, neumann_copies, itm; 


In 47 iterations we find minimum loss 0.0268055, corresponding with a multiple correlation of 0.8956526. The 
object scores are in figure 17 plotted against the original variables (not the transformed variables), and the 
transformation plots in are figure 18. 
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Figure 17: Gases with Convertible Components, Objects Scores 
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Figure 18: Gases with Convertible Components, Transformations 


Discriminant Analysis Example: Iris data 

The next example illustrates (canonical) discriminant analysis, using the obligatory Anderson-Fisher iris data. 
Since there are three species of iris, we use two copies for the species variable. The other four variables are in 
the same set, they are transformed using piecewise linear monotone splines with five knots. 

data(iris, package="datasets") 
iris_vars <- names (iris) 
iris [[5]] <- as.numeric (iris [[5]]) 
iris_knots <- as.list (1:5) 

for (i in 1:4) iris_knots[[i]] <- quantile (iris[[i]], (1:5) / 6) 
iris_knots[[5]] <- 1:3 
iris_degrees <- c (1,1,1,1,0) 

iris_ordinal <- c (TRUE, TRUE, TRUE, TRUE, FALSE) 
iris_copies <- c (1,1,1,1, 2) 
iris_sets <- c(l,l,l,l,2) 

h<-homals (iris, iris_knots,iris_degrees,iris_ordinal,iris_sets,iris_copies, verbose = FALSE) 

In 126 iterations we find minimum loss 0.0307911. The object scores are in figure 19 plotted against the 
original variables (not the transformed variables), and the transformation plots are in figure 20. 
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Figure 19: Iris Data, Objects Scores 



Sepal.Length 





Figure 20: Iris Data, Transformations 

Discriminant analysis decomposes the total dispersion matrix T into a sum of a between-groups dispersion 
B and a within-groups dispersion W, and then finds directions in the space spanned by the variables for 
which the between-variance is largest relative to the total variance. Homogeneity analysis optimizes the sum 
of the r largest eigenvalues of T~ 1 B. Before optimal transformation these eigenvalues for the iris data are 
0.9698722, 0.2220266, after transformation they are 0.9789787, 0.7874823. 
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Multiset Canonical Correlation Example: Thirteen Personality Scales 

This is the same example as before, but now we group the five scales from the Eysenck Personality Inventory 
and the five from the Big Five inventory into sets. The remaining three variables define three separate sets. 
No cpies are used, and we use monotone cubic splines with the interior knots at the quartiles. 

epi_knots <- lapply (epi, function (x) fivenum (x)[2:4]) 

epi_degrees <- rep (3, 13) 

epi_sets <- c (1 , 1 , 1 , 1 , 1 , 2, 2 , 2 ,2, 2,3,4 , 5) 

h <- homals(epi, epi_knots, epi_degrees, epi_ordinal, epi_sets, epi_copies, verbose = FALSE) 

In 196 iterations we find minimum loss 0.4724286. The object scores are in figure 21 and the transformation 
plots in figure 22. 
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Figure 21: Personality Scales, Multi-Set, Objects Scores 
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Figure 22: Personality Scales, Multi-Set, , Transformations 


E 
o 
c n 
c 

05 


Appendix: Code 

source ("gs.R") 

source ( "bsplineBasis.R" ) 

library (nnls) 

homals <- 

function (data, 
knots, 
degrees, 
ordinal, 

sets = l:ncol(data) , 
copies = rep (1, ncol(data)), 
ndim = 2, 
itmax = 1000, 
eps = le-6, 
seed = 123, 
verbose = TRUE) { 
nvars <- ncol (data) 
nobs <- nrow (data) 
nsets <- max (sets) 
ncops <- sum (copies) 
lmax <- ndim * nsets 

indices <- homalslndices (nvars, nsets, ordinal, sets 
setup <- 

homalsSetup (data, knots, degrees, ordinal, sets , 
initials <- 

homalslnitials ( 
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20 50 80 

statanx 


, copies) 

copies, indices$expand) 
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setup$q, 
setup$r, 
nobs, 
nsets, 
ncops, 
ndim, 

indices$expand, 
indices$nvarsets, 
indices$ordicops, 
seed 

) 

fold <- 0 
itel <- 1 

setcopies <- indices$setcopies 
h <- initials$h 
a <- initials$a 
x <- initials$x 
for (s in 1 insets) -f 

hh <- h[l:nobs, which (setcopies == s), drop = FALSE] 
aa <- a[which (setcopies == s), lindim, drop = FALSE] 
fold <- fold + sum ((x - hh 1*1 aa) ~ 2) 

> 

fold <- fold / lmax 
repeat { 

xz <- matrix (0, nobs, ndim) 
fnew <- fmid <- 0 
for (s in 1 nsets) { 

id <- which (setcopies == s) 
hh <- h[l nobs, id, drop = FALSE] 

If <- lm.fit (hh, x) 
aa <- If$coefficients 
rs <- lf$residuals 

kappa <- max (eigen (crossprod (aa))$values) 
fmid <- fmid + sum (rs ~ 2) 
target <- hh + tcrossprod (rs, aa) / kappa 
k <- 1 

for (1 in id) -[ 

ql <- setup$q[[l]] 
vl <- setup$v[[l]] 
if (indices$ordicops[1]) { 

hz <- drop (crossprod (ql, target[, k])) 
ns <- nnls (vl, hz) 
rz <- coefficients (ns) 

hk <- drop (ql °/ 0 *°/ 0 (hz - drop (vl 1*1 rz))) 

> 

else { 

hk <- ql 1*1 crossprod (ql, target[, k]) 

} 

hh[, k] <- hk / sqrt (sum (hk 2)) 
k <- k + 1 

> 

ha <- hh 1*1 aa 
xz <- xz + ha 
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fnew <- fnew + sum ((x - ha) ~ 2) 
h[l:nobs, id] <- hh 
a [id, l:ndim] <- aa 

> 

fmid <- fmid / lmax 
fnew <- fnew / lmax 
if (verbose) 
cat ( 

"Iteration: ", 

formate (itel, width = 3, format = "d"), 

"fold: ", 
formate ( 
fold, 

digits = 8, 
width = 12, 
format = "f" 

), 

"fmid: ", 
formate ( 
fmid, 

digits = 8, 
width = 12, 
format = "f" 

), 

"fnew: ", 
formate ( 
fnew, 

digits = 8, 
width = 12, 
format = "f" 

), 

"\n" 

) 

if ((itel == itmax) II ((fold - fnew) < eps)) 
break 

itel <- itel + 1 
fold <- fnew 
x <- gs (center (xz))$q 

> 

return (list ( 
f = fnew, 
ntel = itel, 
x = x, 
a = a, 
h = h 

)) 

> 

homalslndices <- function (nvars, nsets, ordinal, sets, copies) { 
ordicops <- logical (0) 
expand <- numeric (0) 
for (j in 1 nvars) -[ 

ordicops <- c (ordicops, ordinal [j], rep (FALSE, copies[j] - 1)) 
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expand <- c (expand, rep (j, copies [j])) 


> 

setcopies <- sets [expand] 
nvarsets <- numeric (0) 
for (s in 1 nsets) { 

nvarsets <- c (nvarsets, sum (copies [which (sets == s)])) 

> 

return (list ( 

ordicops = ordicops, 
setcopies = setcopies, 
expand = expand, 
nvarsets = nvarsets 

)) 

> 

homalsSetup <- 
function (data, 
knots, 
degrees, 
ordinal, 
sets, 
copies, 
expand) { 

nvars <- ncol (data) 
nobs <- nrow (data) 
ncops <- sum (copies) 
g <- b <- q <- r <- v <- listO 
for (1 in l:ncops) ■[ 
j <- expand[1] 
mm <- is.na(data[, j]) 
nm <- !mm 
dm <- data[nm, j] 
if (degrees [j] < 0) { 

gn <- ifelse (outer (dm, unique (dm), "=="), 1, 0) 

> 

else { 

gn <- bsplineBasis (dm, degrees[j], knots[[j]]) 

> 

nr <- ncol (gn) 
ns <- sum (mm) 

gg <- matrix (0, nobs, nr + ns) 
gg[!mm, 1: nr] <- gn 
if (ns > 0) { 

gg[mm, nr + (l:ns)] <- diag (ns) 

> 

gg <- center (gg)[,-l, drop = FALSE] 

g <- c (g, list (gg)) 

bb <- gs (g[[l]]) 

q <- c(q, list (bb$q)) 

r <- c (r, list (bb$r)) 

vv <- makeV (data[[j]], q[[l]]) 

v <- c (v, list (vv)) 

} 
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return (list (q = q, r = r, v = v)) 

> 

homalslnitials <- 
function (q, 
r, 

nobs, 
nsets, 
ncops, 
ndim, 
expand, 
nvarsets, 
ordicops, 
seed) { 
set.seed (seed) 

x <- matrix (rnorm (nobs * ndim), nobs, ndim) 
x <- gs (center (x))$q 

a <- matrix (rnorm (ncops * ndim), ncops, ndim) 
h <- matrix (0, nobs, ncops) 
for (1 in l:ncops) { 
ql <- q [ [1] 1 
rl <- r [ [1] ] 
if (ordicops[1]) 
cf <- l:ncol (ql) 
else 

cf <- rnorm (ncol (ql)) 
h[, 1] <- drop (ql %*% rl %*“/« cf) 

} 

h <- center (h) 

h <- apply (h, 2, function (z) 
z / sqrt (sum (z ~ 2))) 
return (list (x = x, a = a, h = h)) 

> 

makeV <- function (x, g) { 

r <- order (x) [1: length(which( ! is .na(x) ))] 
n <- length (r) 
m <- ncol (g) 
v <- numeric (0) 
k <- 0 

for (i in 1: (n - 1) ) -( 
ri <- r[i] 
rj <- r [i + 1] 

if (is. na(x [ri]) II is ,na(x [rj])) 
next 

if (x [rj] > x [ri] ) { 

v <- c (v, g[ri,] - g[rj ,] ) 
k <- k + 1 

> 

> 

return (t (matrix (v, k, m, byrow = TRUE))) 
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center <- function (x) { 

return (apply (x, 2, function (z) 
z - mean (z))) 

> 

normalize <- function (x) { 

return (x / sqrt (sum (x “ 2))) 

> 

Appendix: NEWS 

016: 06/28/15 

• added figure captioning and numbering 

• separated computation and plotting chunks 

• hlines around figures 

• rescaled small example 

• added table of contents 

• redid figures and computations in single chapter 

• added NEWS section 

• added link to pdf and Rmd versions 

017: 06/29/15 

• added nominal, ordinal, numerical 

• added ALS algorithm for ordinal 

• expanded NLPCA section 

018: 06/30/15 

• more ALS for ordinal 

• worked on NLPCA example 

019: 07/01/15 

• minor editing changes throughout 

• introduced QR decomposition of indicators earlier 

020: 07/02/15 

• transformation plots for epi example (still a bug in ordinal single) 

021: 07/06/15 

• squashed the bug in ordinal 

• added TO DO appendix 

• code now has inner iterations 

022: 07/07/13 

• added single numerical example 

• some NLPCA theory added 

023: 07/08/15 

• center the basis matrices in the code 
024: 07/08/15 

• added star plots for MCA degree zero 

• added graphics : yes and keep tex: yes to the YAML to please pandoc 

• material on sets of variables 


28 



025: 07/10/15 

• material on copies 
027: 07/11/15 

• some editing 

• copies algorithm 

028: 07/12/15 

• majorization part of copies algorithm 
029: 07/13/15 

• majorization part of copies algorithm 
031: 07/21/15 

• new copies-based code wrapped up and inserted 

• redid all examples and plots with the new homals program 

• made ftp directory 

032: 07/21/15 

• added neumann example 
033: 07/21/15 

• minor edits 

• normalized loss 

034: 07/22/15 

• extensive editing 

• itermax default to 1000 

• got rid of details of “old” algorithm 

• introduce copies much earlier 

035: 07/24/15 

• (X,H,A) loss replacing (X,Y) loss earlier 

• transformation plots become line plots 

• added stepPlotter function for discrete case 

• redid discrete plots with stepPlotter 

036: 07/27/15 

• described copies algorithm earlier and in more detail 

• added caches for homals solutions 

037: 08/02/15 

• added seed to homals parameter list for homalslnitials 

• fixed homalsSetup bug for dim(g[[j]])[2]=2 (added DROP=FALSE) 

038: 08/03/15 

• various edits 

• move copies loss even more to the beginning 

• number algorithms 

039: 08/04/15 

• added iris example 
040: 08/05/15 
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• added multiset CCA example 
041: 08/19/15 

• complete rewrite, remove old notation, Gifi approach 
042: 08/20/15 

• still working on the copies rewrite 
043: 09/27/15 

• reorganizing and rewriting 
044: 10/01/15 

• reorganizing and rewriting 
045: 10/15/15 

• reorganizing and rewriting 
046: 10/20/15 

• reorganizing and rewriting 
047: 10/21/15 

• implementation details added 
048: 10/23/15 

• cut out some unnecessary stuff 
049: 10/25/15 

• special cases expanded 

• loss geometry eliminated for now 

050: 11/01/15 

• needs more testing, but added primary/secondary ties to ordinal 

• may not work yet, but provisions for missing data 

051: 11/02/15 

• missing data implemented, needs more testing 
052: 11/05/15 

• some edits 

• degree = -1 now means categorical variable, makes indicator without calling B-spline code 

• primary/secondary ties for each variable separately 

053: 11/07/15 

• eliminate secondary approach to ties, which does not make much sense 
054: 11/08/15 

• some code changes to deal with categorical variables 

• reformat code 

100: 11/10/15 

• some final edits 
101: 11/10/15 

• added some before and after statistics 
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102: 11/12/15 

• lots of small edits 
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